91 research outputs found

    Sharing deep generative representation for perceived image reconstruction from human brain activity

    Decoding human brain activities via functional magnetic resonance imaging (fMRI) has gained increasing attention in recent years. While encouraging results have been reported for brain-state classification tasks, reconstructing the details of human visual experience remains difficult. Two main challenges that hinder the development of effective models are the perplexing fMRI measurement noise and the high dimensionality of the limited data instances. Existing methods generally suffer from one or both of these issues and yield unsatisfactory results. In this paper, we tackle this problem by casting the reconstruction of the visual stimulus as Bayesian inference of a missing view in a multi-view latent variable model. Sharing a common latent representation, our joint generative model of external stimulus and brain response is not only "deep" in extracting nonlinear features from visual images, but also powerful in capturing correlations among voxel activities in fMRI recordings. The nonlinearity and deep structure endow our model with strong representation ability, while the correlations among voxel activities are critical for suppressing noise and improving prediction. We devise an efficient variational Bayesian method to infer the latent variables and the model parameters. To further improve reconstruction accuracy, the latent representations of test instances are enforced to be close to those of their neighbours from the training set via posterior regularization. Experiments on three fMRI recording datasets demonstrate that our approach reconstructs visual stimuli more accurately.
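
    A minimal sketch (not the authors' code) of the idea described above: a two-view generative model with a shared latent variable z, one decoder for the image view and one for the fMRI voxel view, trained with a VAE-style objective plus a neighbour-based posterior-regularization term. The layer sizes, `voxel_dim`, and the weight `lam` are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SharedLatentModel(nn.Module):
    def __init__(self, img_dim=784, voxel_dim=3000, z_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(voxel_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 2 * z_dim))     # q(z | voxels)
        self.dec_img = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                     nn.Linear(256, img_dim))   # p(image | z)
        self.dec_vox = nn.Linear(z_dim, voxel_dim)              # p(voxels | z)

    def forward(self, voxels):
        mu, logvar = self.enc(voxels).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()    # reparameterization
        return self.dec_img(z), self.dec_vox(z), mu, logvar

def elbo_with_neighbour_reg(model, voxels, images, neighbour_mu=None, lam=0.1):
    img_hat, vox_hat, mu, logvar = model(voxels)
    rec = ((img_hat - images) ** 2).sum() + ((vox_hat - voxels) ** 2).sum()
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum()
    loss = rec + kl
    if neighbour_mu is not None:
        # posterior regularization: pull the test latent toward training neighbours
        loss = loss + lam * ((mu - neighbour_mu) ** 2).sum()
    return loss
```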

    Auditory Attention Decoding with Task-Related Multi-View Contrastive Learning

    The human brain can easily focus on one speaker and suppress others in scenarios such as a cocktail party. Recently, researchers found that auditory attention can be decoded from electroencephalogram (EEG) data. However, most existing deep learning methods struggle to use prior knowledge about the different views (namely, that attended speech and EEG are task-related views) and therefore extract unsatisfactory representations. Inspired by Broadbent's filter model, we decode auditory attention in a multi-view paradigm and extract the most relevant and important information by utilizing the missing view. Specifically, we propose an auditory attention decoding (AAD) method based on a multi-view VAE with task-related multi-view contrastive (TMC) learning. Employing TMC learning in the multi-view VAE uses the missing view to accumulate prior knowledge of the different views into the fused representation and to extract an approximately task-related representation. We evaluate our method on two popular AAD datasets and demonstrate its superiority by comparing it to the state-of-the-art method.
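
    Illustrative sketch only: an InfoNCE-style contrastive term that pulls an EEG embedding toward the embedding of its attended speech (the task-related view) and pushes it away from other utterances in the batch. The encoder architectures are omitted and the temperature value is an assumption, not the paper's exact design.

```python
import torch
import torch.nn.functional as F

def task_related_contrastive(eeg_z, speech_z, temperature=0.1):
    """eeg_z, speech_z: (batch, dim) embeddings of paired EEG and attended speech."""
    eeg_z = F.normalize(eeg_z, dim=-1)
    speech_z = F.normalize(speech_z, dim=-1)
    logits = eeg_z @ speech_z.t() / temperature   # similarity of every EEG/speech pair
    labels = torch.arange(eeg_z.size(0))          # the matching pair is the positive
    return F.cross_entropy(logits, labels)

# usage with stand-in embeddings
eeg_z = torch.randn(8, 64)
speech_z = torch.randn(8, 64)
print(task_related_contrastive(eeg_z, speech_z).item())
```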

    Multi-view Multi-label Fine-grained Emotion Decoding from Human Brain Activity

    Decoding emotional states from human brain activity plays an important role in brain-computer interfaces. Existing emotion decoding methods still have two main limitations: one is decoding only a single, coarse-grained emotion category from a brain activity pattern, which is inconsistent with the complexity of human emotional expression; the other is ignoring the discrepancy in emotion expression between the left and right hemispheres of the human brain. In this paper, we propose a novel multi-view multi-label hybrid model for fine-grained emotion decoding (up to 80 emotion categories) that can learn expressive neural representations and predict multiple emotional states simultaneously. Specifically, the generative component of our hybrid model is parametrized by a multi-view variational auto-encoder, in which we regard the brain activity of the left and right hemispheres and their difference as three distinct views and use a product-of-experts mechanism in its inference network. The discriminative component of our hybrid model is implemented by a multi-label classification network with an asymmetric focal loss. For more accurate emotion decoding, we first adopt a label-aware module for learning emotion-specific neural representations and then model the dependency among emotional states with a masked self-attention mechanism. Extensive experiments on two visually evoked emotional datasets show the superiority of our method. Comment: Accepted by IEEE Transactions on Neural Networks and Learning Systems.
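
    A minimal sketch of two ingredients named above, under stated assumptions: product-of-experts fusion of Gaussian per-view posteriors (left hemisphere, right hemisphere, and their difference), and an asymmetric focal loss for the multi-label head. The gamma values and clipping margin are illustrative, not the paper's reported settings.

```python
import torch

def product_of_experts(mus, logvars):
    """Combine per-view Gaussians q_i(z) = N(mu_i, var_i) into one Gaussian."""
    precisions = [torch.exp(-lv) for lv in logvars]
    prec_sum = sum(precisions) + 1.0                # +1 for a standard-normal prior expert
    mu = sum(m * p for m, p in zip(mus, precisions)) / prec_sum
    var = 1.0 / prec_sum
    return mu, var.log()

def asymmetric_focal_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0, clip=0.05):
    """Multi-label loss that down-weights easy negatives more strongly than positives."""
    p = torch.sigmoid(logits)
    p_neg = (p - clip).clamp(min=0)                 # shift negatives to ignore the easiest ones
    loss_pos = targets * (1 - p).pow(gamma_pos) * torch.log(p.clamp(min=1e-8))
    loss_neg = (1 - targets) * p_neg.pow(gamma_neg) * torch.log((1 - p_neg).clamp(min=1e-8))
    return -(loss_pos + loss_neg).mean()
```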

    Semi-supervised Deep Generative Modelling of Incomplete Multi-Modality Emotional Data

    Emotion recognition faces three main challenges. First, it is difficult to recognize a person's emotional state from a single modality alone. Second, manually annotating emotional data is expensive. Third, emotional data often suffer from missing modalities due to unforeseeable sensor malfunction or configuration issues. In this paper, we address all of these problems within a novel multi-view deep generative framework. Specifically, we propose to model the statistical relationships of multi-modality emotional data using multiple modality-specific generative networks with a shared latent space. By imposing a Gaussian mixture assumption on the posterior approximation of the shared latent variables, our framework can learn the joint deep representation from multiple modalities and evaluate the importance of each modality simultaneously. To address the scarcity of labeled data, we extend our multi-view model to the semi-supervised learning scenario by casting the semi-supervised classification problem as a specialized missing-data imputation task. To address the missing-modality problem, we further extend our semi-supervised multi-view model to handle incomplete data, where a missing view is treated as a latent variable and integrated out during inference. In this way, the overall framework can utilize all available data (both labeled and unlabeled, complete and incomplete) to improve its generalization ability. Experiments on two real multi-modal emotion datasets demonstrate the superiority of our framework. Comment: arXiv admin note: text overlap with arXiv:1704.07548; 2018 ACM Multimedia Conference (MM'18).
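
    A sketch of the semi-supervised idea sketched above, under the assumption that the shared latent posterior is a Gaussian mixture whose components act as class-specific experts, so that classifying an unlabeled example amounts to inferring its missing mixture assignment. Dimensions, the entropy term, and the fused-feature input are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemiSupervisedMixtureHead(nn.Module):
    def __init__(self, feat_dim=128, z_dim=32, n_classes=4):
        super().__init__()
        self.enc = nn.Linear(feat_dim, 2 * z_dim)                # per-example Gaussian q(z|x)
        self.mu_c = nn.Parameter(torch.randn(n_classes, z_dim))  # class-wise component means

    def forward(self, fused_features):
        mu, logvar = self.enc(fused_features).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        # responsibility of each class component = imputed label distribution
        dist = ((z.unsqueeze(1) - self.mu_c) ** 2).sum(-1)       # (batch, n_classes)
        return F.softmax(-dist, dim=-1), mu, logvar

head = SemiSupervisedMixtureHead()
labeled_x, labels = torch.randn(4, 128), torch.tensor([0, 1, 2, 3])
unlabeled_x = torch.randn(4, 128)
q_y_lab, _, _ = head(labeled_x)
q_y_unlab, _, _ = head(unlabeled_x)
loss = F.nll_loss(q_y_lab.clamp(min=1e-8).log(), labels) \
       - (q_y_unlab * q_y_unlab.clamp(min=1e-8).log()).sum(1).mean()  # entropy term on unlabeled data
```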

    MindDiffuser: Controlled Image Reconstruction from Human Brain Activity with Semantic and Structural Diffusion

    Reconstructing visual stimuli from brain recordings is a meaningful and challenging task. In particular, achieving precise and controllable image reconstruction is of great significance for advancing the development and use of brain-computer interfaces. Despite advances in complex image reconstruction techniques, it remains challenging to achieve a cohesive alignment of both semantics (concepts and objects) and structure (position, orientation, and size) with the image stimuli. To address this issue, we propose a two-stage image reconstruction model called MindDiffuser. In Stage 1, the VQ-VAE latent representations and the CLIP text embeddings decoded from fMRI are fed into Stable Diffusion, which yields a preliminary image containing semantic information. In Stage 2, we use the CLIP visual feature decoded from fMRI as supervisory information and continually adjust the two feature vectors decoded in Stage 1 through backpropagation to align the structural information. Both qualitative and quantitative analyses demonstrate that our model surpasses the current state-of-the-art models on the Natural Scenes Dataset (NSD). Subsequent experimental findings corroborate the neurobiological plausibility of the model, as evidenced by the interpretability of the multimodal features employed, which align with the corresponding brain responses. Comment: arXiv admin note: substantial text overlap with arXiv:2303.1413
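
    A schematic sketch of the two-stage control flow only, not MindDiffuser's actual implementation: `stable_diffusion` and `clip_visual` are placeholder callables (assumed differentiable), and the linear decoders `W_latent`, `W_text`, `W_clip` stand in for whatever regression models map fMRI to feature space.

```python
import torch

def decode_features(fmri, W_latent, W_text):
    """Linear decoders from fMRI to the VQ-VAE latent and CLIP text feature spaces."""
    return fmri @ W_latent, fmri @ W_text

def reconstruct(fmri, W_latent, W_text, W_clip, stable_diffusion, clip_visual,
                steps=50, lr=0.1):
    z_img, z_text = decode_features(fmri, W_latent, W_text)
    z_img = z_img.clone().requires_grad_(True)
    z_text = z_text.clone().requires_grad_(True)
    clip_target = fmri @ W_clip                      # CLIP visual feature decoded from fMRI
    opt = torch.optim.Adam([z_img, z_text], lr=lr)
    for _ in range(steps):                           # Stage 2: adjust the decoded features
        image = stable_diffusion(z_img, z_text)      # Stage 1: semantic preliminary image
        loss = ((clip_visual(image) - clip_target) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return stable_diffusion(z_img, z_text)
```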

    Evaluation of multiple voxel-based morphometry approaches and applications in the analysis of white matter changes in temporal lobe epilepsy

    The purpose of this study was to compare multiple voxel-based morphometry (VBM) approaches and analyze whole-brain white matter (WM) changes in unilateral temporal lobe epilepsy (TLE) patients relative to controls. The performance of the VBM approaches, including standard VBM, optimized VBM, and VBM-DARTEL, was evaluated via a simulation, and these approaches were then applied to real data obtained from TLE patients and controls. The simulation results show that VBM-DARTEL performs best among these VBM approaches. For the real data, VBM-DARTEL found WM reductions in the ipsilateral temporal lobe, the contralateral frontal and occipital lobes, the bilateral parietal lobes, the cingulate gyrus, the parahippocampal gyrus, and the brainstem of left-TLE patients, which is consistent with previous studies. Our study demonstrates that DARTEL is the most robust and reliable approach for VBM analysis.
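
    Not the study's pipeline: a minimal voxel-wise group comparison of smoothed, normalized white-matter maps, of the kind that follows any of the VBM variants (standard, optimized, or DARTEL). Array shapes, the smoothing width, and the random stand-in data are assumptions.

```python
import numpy as np
from scipy import stats, ndimage

def vbm_t_map(wm_patients, wm_controls, fwhm_vox=4.0):
    """wm_*: arrays of shape (n_subjects, x, y, z) holding WM probability maps."""
    sigma = fwhm_vox / 2.355                                    # FWHM -> Gaussian sigma
    sm_p = np.stack([ndimage.gaussian_filter(v, sigma) for v in wm_patients])
    sm_c = np.stack([ndimage.gaussian_filter(v, sigma) for v in wm_controls])
    t, p = stats.ttest_ind(sm_p, sm_c, axis=0)                  # voxel-wise two-sample t-test
    return t, p

t_map, p_map = vbm_t_map(np.random.rand(10, 8, 8, 8), np.random.rand(12, 8, 8, 8))
print(t_map.shape, (p_map < 0.001).sum())
```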

    Glutamatergic and Resting-State Functional Connectivity Correlates of Severity in Major Depression – The Role of Pregenual Anterior Cingulate Cortex and Anterior Insula

    Glutamatergic mechanisms and resting-state functional connectivity alterations have recently been described as factors contributing to major depressive disorder (MDD). Furthermore, the pregenual anterior cingulate cortex (pgACC) seems to play an important role in major depressive symptoms such as anhedonia and impaired emotion processing. We investigated 22 MDD patients and 22 healthy subjects using a combined magnetic resonance spectroscopy (MRS) and resting-state functional magnetic resonance imaging (fMRI) approach. Severity of depression was rated using the 21-item Hamilton depression scale (HAMD), and patients were divided into severely and mildly depressed subgroups according to HAMD scores. Because of their hypothesized role in depression, we investigated the functional connectivity between the pgACC and the left anterior insular cortex (AI). The sum of glutamate and glutamine (Glx) in the pgACC, but not in the left AI, predicted the resting-state functional connectivity between the two regions exclusively in depressed patients. Furthermore, functional connectivity between these regions was significantly altered in the subgroup of severely depressed patients (HAMD > 15) compared with healthy subjects and mildly depressed patients. Similarly, the Glx ratios relative to creatine in the pgACC were lowest in severely depressed patients. These findings support the involvement of glutamatergic mechanisms in severe MDD, which are related to the functional connectivity between the pgACC and AI and to depression severity.
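
    Illustrative only, with random stand-in data: pgACC-AI functional connectivity computed as the correlation of the two regions' resting-state time courses, followed by a test of whether pgACC Glx predicts that connectivity, in the spirit of the analysis described above. Subject counts and variable values are placeholders.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sub, n_tr = 22, 200
pgacc_ts = rng.standard_normal((n_sub, n_tr))   # region-averaged BOLD time courses
ai_ts = rng.standard_normal((n_sub, n_tr))
glx = rng.normal(1.2, 0.2, n_sub)               # pgACC Glx/creatine ratios (stand-in values)

# per-subject connectivity = Pearson r between the two time courses
fc = np.array([stats.pearsonr(pgacc_ts[i], ai_ts[i])[0] for i in range(n_sub)])
r, p = stats.pearsonr(glx, fc)                  # does Glx predict connectivity?
print(f"r = {r:.2f}, p = {p:.3f}")
```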